Results 1 - 20 of 21
1.
JAMA Netw Open ; 6(4): e238203, 2023 04 03.
Article in English | MEDLINE | ID: covidwho-2291703

ABSTRACT

This cohort study uses hospitalization and 30-day mortality risks to create a temporal profile of the severity of COVID-19 in Massachusetts from July 2021 to December 2022.


Subject(s)
COVID-19 , Humans , Massachusetts/epidemiology , SARS-CoV-2
2.
J Biomed Inform ; 139: 104306, 2023 03.
Article in English | MEDLINE | ID: covidwho-2220929

ABSTRACT

BACKGROUND: In electronic health records, patterns of missing laboratory test results could capture patients' course of disease as well as reflect clinicians' concerns or worries for possible conditions. These patterns are often understudied and overlooked. This study aims to identify informative patterns of missingness among laboratory data collected across 15 healthcare system sites in three countries for COVID-19 inpatients. METHODS: We collected and analyzed demographic, diagnosis, and laboratory data for 69,939 patients with positive COVID-19 PCR tests across three countries from 1 January 2020 through 30 September 2021. We analyzed missing laboratory measurements across sites, missingness stratification by demographic variables, temporal trends of missingness, correlations between labs based on missingness indicators over time, and clustering of groups of labs based on their missingness/ordering pattern. RESULTS: With these analyses, we identified mapping issues faced in seven out of 15 sites. We also identified nuances in data collection and variable definition for the various sites. Temporal trend analyses may support the use of laboratory test result missingness patterns in identifying severe COVID-19 patients. Lastly, using missingness patterns, we determined relationships between various labs that reflect clinical behaviors. CONCLUSION: In this work, we use computational approaches to relate missingness patterns to hospital treatment capacity and highlight the heterogeneity of looking at COVID-19 over time and at multiple sites, where there might be different phases, policies, etc. Changes in missingness could suggest a change in a patient's condition, and patterns of missingness among laboratory measurements could potentially identify clinical outcomes. This allows sites to consider missing data as informative to analyses and help researchers identify which sites are better poised to study particular questions.


Subject(s)
COVID-19 , Electronic Health Records , Humans , Data Collection , Records , Cluster Analysis
3.
JAMA Netw Open ; 5(10): e2238354, 2022 10 03.
Article in English | MEDLINE | ID: covidwho-2084942

ABSTRACT

Importance: The SARS-CoV-2 Omicron subvariant, BA.2, may be less severe than previous variants; however, confounding factors make interpreting the intrinsic severity challenging. Objective: To compare the adjusted risks of mortality, hospitalization, intensive care unit admission, and invasive ventilation between the BA.2 subvariant and the Omicron and Delta variants, after accounting for multiple confounders. Design, Setting, and Participants: This was a retrospective cohort study that applied an entropy balancing approach. Patients in a multicenter inpatient and outpatient system in New England with COVID-19 between March 3, 2020, and June 20, 2022, were identified. Exposures: Cases were assigned as being exposed to the Delta (B.1.617.2) variant, the Omicron (B.1.1.529) variant, or the Omicron BA.2 lineage subvariants. Main Outcomes and Measures: The primary study outcome planned before analysis was risk of 30-day mortality. Secondary outcomes included the risks of hospitalization, invasive ventilation, and intensive care unit admissions. Results: Of 102 315 confirmed COVID-19 cases (mean [SD] age, 44.2 [21.6] years; 63 482 women [62.0%]), 20 770 were labeled as Delta variants, 52 605 were labeled as the Omicron B.1.1.529 variant, and 28 940 were labeled as Omicron BA.2 subvariants. Patient cases were excluded if they occurred outside the prespecified temporal windows associated with the variants or had minimal longitudinal data in the Mass General Brigham system before COVID-19. Mortality rates were 0.7% for Delta (B.1.617.2), 0.4% for Omicron (B.1.1.529), and 0.3% for Omicron (BA.2). The adjusted odds ratio of mortality from the Delta variant compared with the Omicron BA.2 subvariants was 2.07 (95% CI, 1.04-4.10) and that of the original Omicron variant compared with the Omicron BA.2 subvariant was 2.20 (95% CI, 1.56-3.11). For all outcomes, the Omicron BA.2 subvariants were significantly less severe than that of the Omicron and Delta variants. 
Conclusions and Relevance: In this cohort study, after having accounted for a variety of confounding factors associated with SARS-CoV-2 outcomes, the Omicron BA.2 subvariant was found to be intrinsically less severe than both the Delta and Omicron variants. With respect to these variants, the severity profile of SARS-CoV-2 appears to be diminishing after taking into account various factors including therapeutics, vaccinations, and prior infections.


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , Female , Adult , COVID-19/epidemiology , Cohort Studies , Retrospective Studies , New England/epidemiology
4.
J Biomed Inform ; 133: 104147, 2022 09.
Article in English | MEDLINE | ID: covidwho-1959659

ABSTRACT

OBJECTIVE: The growing availability of electronic health records (EHR) data opens opportunities for integrative analysis of multi-institutional EHR to produce generalizable knowledge. A key barrier to such integrative analyses is the lack of semantic interoperability across different institutions due to coding differences. We propose a Multiview Incomplete Knowledge Graph Integration (MIKGI) algorithm to integrate information from multiple sources with partially overlapping EHR concept codes to enable translations between healthcare systems. METHODS: The MIKGI algorithm combines knowledge graph information from (i) embeddings trained from the co-occurrence patterns of medical codes within each EHR system and (ii) semantic embeddings of the textual strings of all medical codes obtained from the Self-Aligning Pretrained BERT (SAPBERT) algorithm. Due to the heterogeneity in the coding across healthcare systems, each EHR source provides partial coverage of the available codes. MIKGI synthesizes the incomplete knowledge graphs derived from these multi-source embeddings by minimizing a spherical loss function that combines the pairwise directional similarities of embeddings computed from all available sources. MIKGI outputs harmonized semantic embedding vectors for all EHR codes, which improves the quality of the embeddings and enables direct assessment of both similarity and relatedness between any pair of codes from multiple healthcare systems. RESULTS: With EHR co-occurrence data from Veteran Affairs (VA) healthcare and Mass General Brigham (MGB), MIKGI algorithm produces high quality embeddings for a variety of downstream tasks including detecting known similar or related entity pairs and mapping VA local codes to the relevant EHR codes used at MGB. Based on the cosine similarity of the MIKGI trained embeddings, the AUC was 0.918 for detecting similar entity pairs and 0.809 for detecting related pairs. 
For cross-institutional medical code mapping, the top 1 and top 5 accuracy were 91.0% and 97.5% when mapping medication codes at VA to RxNorm medication codes at MGB; 59.1% and 75.8% when mapping VA local laboratory codes to LOINC hierarchy. When trained with 500 labels, the lab code mapping attained top 1 and 5 accuracy at 77.7% and 87.9%. MIKGI also attained best performance in selecting VA local lab codes for desired laboratory tests and COVID-19 related features for COVID EHR studies. Compared to existing methods, MIKGI attained the most robust performance with accuracy the highest or near the highest across all tasks. CONCLUSIONS: The proposed MIKGI algorithm can effectively integrate incomplete summary data from biomedical text and EHR data to generate harmonized embeddings for EHR codes for knowledge graph modeling and cross-institutional translation of EHR codes.


Subject(s)
COVID-19 , Electronic Health Records , Algorithms , Humans , Logical Observation Identifiers Names and Codes , Pattern Recognition, Automated
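
The top-1 and top-5 accuracies reported in the abstract above rank candidate target codes by the cosine similarity of their embeddings. The following is only a minimal sketch of that evaluation idea, not the MIKGI algorithm itself: the vectors, code names (`local_glucose`, `T_GLUCOSE`, and so on), and the helper `top_k_accuracy` are all hypothetical and chosen purely for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def top_k_accuracy(source, target, gold, k):
    """Share of source codes whose gold target is among the k nearest targets."""
    hits = 0
    for code, vec in source.items():
        # Rank every target code by similarity to this source embedding.
        ranked = sorted(target, key=lambda t: cosine(vec, target[t]), reverse=True)
        if gold[code] in ranked[:k]:
            hits += 1
    return hits / len(source)


# Hypothetical toy embeddings (assumed for illustration, not from the paper).
source = {
    "local_glucose": [0.9, 0.1, 0.0],
    "local_sodium": [0.1, 0.6, 0.5],
}
target = {
    "T_GLUCOSE": [1.0, 0.0, 0.0],
    "T_SODIUM": [0.0, 1.0, 0.0],
    "T_POTASSIUM": [0.0, 0.5, 0.9],
}
gold = {"local_glucose": "T_GLUCOSE", "local_sodium": "T_SODIUM"}

top1 = top_k_accuracy(source, target, gold, k=1)
top2 = top_k_accuracy(source, target, gold, k=2)
```

In this toy data the sodium code's nearest neighbour is the wrong target, so accuracy improves when the correct answer only needs to appear within the top 2, which mirrors why top-5 figures exceed top-1 figures.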
5.
NPJ Digit Med ; 5(1): 81, 2022 Jun 29.
Article in English | MEDLINE | ID: covidwho-1908301

ABSTRACT

The risk profiles of post-acute sequelae of COVID-19 (PASC) have not been well characterized in multi-national settings with appropriate controls. We leveraged electronic health record (EHR) data from 277 international hospitals representing 414,602 patients with COVID-19, 2.3 million control patients without COVID-19 in the inpatient and outpatient settings, and over 221 million diagnosis codes to systematically identify new-onset conditions enriched among patients with COVID-19 during the post-acute period. Compared to inpatient controls, inpatient COVID-19 cases were at significant risk for angina pectoris (RR 1.30, 95% CI 1.09-1.55), heart failure (RR 1.22, 95% CI 1.10-1.35), cognitive dysfunctions (RR 1.18, 95% CI 1.07-1.31), and fatigue (RR 1.18, 95% CI 1.07-1.30). Relative to outpatient controls, outpatient COVID-19 cases were at risk for pulmonary embolism (RR 2.10, 95% CI 1.58-2.76), venous embolism (RR 1.34, 95% CI 1.17-1.54), atrial fibrillation (RR 1.30, 95% CI 1.13-1.50), type 2 diabetes (RR 1.26, 95% CI 1.16-1.36) and vitamin D deficiency (RR 1.19, 95% CI 1.09-1.30). Outpatient COVID-19 cases were also at risk for loss of smell and taste (RR 2.42, 95% CI 1.90-3.06), inflammatory neuropathy (RR 1.66, 95% CI 1.21-2.27), and cognitive dysfunction (RR 1.18, 95% CI 1.04-1.33). The incidence of post-acute cardiovascular and pulmonary conditions decreased across time among inpatient cases while the incidence of cardiovascular, digestive, and metabolic conditions increased among outpatient cases. Our study, based on a federated international network, systematically identified robust conditions associated with PASC compared to control groups, underscoring the multifaceted cardiovascular and neurological phenotype profiles of PASC.

6.
NPJ Digit Med ; 5(1): 74, 2022 Jun 13.
Article in English | MEDLINE | ID: covidwho-1890276

ABSTRACT

Given the growing number of prediction algorithms developed to predict COVID-19 mortality, we evaluated the transportability of a mortality prediction algorithm using a multi-national network of healthcare systems. We predicted COVID-19 mortality using baseline commonly measured laboratory values and standard demographic and clinical covariates across healthcare systems, countries, and continents. Specifically, we trained a Cox regression model with nine measured laboratory test values, standard demographics at admission, and comorbidity burden pre-admission. These models were compared at site, country, and continent level. Of the 39,969 hospitalized patients with COVID-19 (68.6% male), 5717 (14.3%) died. In the Cox model, age, albumin, AST, creatinine, CRP, and white blood cell count are most predictive of mortality. The baseline covariates are more predictive of mortality during the early days of COVID-19 hospitalization. Models trained at healthcare systems with larger cohort size largely retain good transportability performance when porting to different sites. The combination of routine laboratory test values at admission along with basic demographic features can predict mortality in patients hospitalized with COVID-19. Importantly, this potentially deployable model differs from prior work by demonstrating not only consistent performance but also reliable transportability across healthcare systems in the US and Europe, highlighting the generalizability of this model and the overall approach.

7.
J Med Internet Res ; 24(5): e37931, 2022 05 18.
Article in English | MEDLINE | ID: covidwho-1862520

ABSTRACT

BACKGROUND: Admissions are generally classified as COVID-19 hospitalizations if the patient has a positive SARS-CoV-2 polymerase chain reaction (PCR) test. However, because 35% of SARS-CoV-2 infections are asymptomatic, patients admitted for unrelated indications with an incidentally positive test could be misclassified as a COVID-19 hospitalization. Electronic health record (EHR)-based studies have been unable to distinguish between a hospitalization specifically for COVID-19 versus an incidental SARS-CoV-2 hospitalization. Although the need to improve classification of COVID-19 versus incidental SARS-CoV-2 is well understood, the magnitude of the problems has only been characterized in small, single-center studies. Furthermore, there have been no peer-reviewed studies evaluating methods for improving classification. OBJECTIVE: The aims of this study are to, first, quantify the frequency of incidental hospitalizations over the first 15 months of the pandemic in multiple hospital systems in the United States and, second, to apply electronic phenotyping techniques to automatically improve COVID-19 hospitalization classification. METHODS: From a retrospective EHR-based cohort in 4 US health care systems in Massachusetts, Pennsylvania, and Illinois, a random sample of 1123 SARS-CoV-2 PCR-positive patients hospitalized from March 2020 to August 2021 was manually chart-reviewed and classified as "admitted with COVID-19" (incidental) versus specifically admitted for COVID-19 ("for COVID-19"). EHR-based phenotyping was used to find feature sets to filter out incidental admissions. RESULTS: EHR-based phenotyped feature sets filtered out incidental admissions, which occurred in an average of 26% of hospitalizations (although this varied widely over time, from 0% to 75%). The top site-specific feature sets had 79%-99% specificity with 62%-75% sensitivity, while the best-performing across-site feature sets had 71%-94% specificity with 69%-81% sensitivity. 
CONCLUSIONS: A large proportion of SARS-CoV-2 PCR-positive admissions were incidental. Straightforward EHR-based phenotypes differentiated admissions, which is important to assure accurate public health reporting and research.


Subject(s)
COVID-19 , SARS-CoV-2 , COVID-19/diagnosis , COVID-19/epidemiology , Electronic Health Records , Hospitalization , Humans , Retrospective Studies
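
The sensitivity and specificity figures in the abstract above compare a phenotype's flag against chart review. As a rough sketch of how such figures are computed, assuming chart review is the reference label and using made-up data (the lists and the helper `sensitivity_specificity` are hypothetical, not the study's code):

```python
def sensitivity_specificity(predicted, actual):
    """Return (sensitivity, specificity) for paired boolean labels.

    predicted: phenotype flag (True = classified as admitted for COVID-19)
    actual:    reference label, e.g. chart review (True = admitted for COVID-19)
    """
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    tn = sum(1 for p, a in zip(predicted, actual) if not p and not a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    sensitivity = tp / (tp + fn)  # share of true admissions the flag catches
    specificity = tn / (tn + fp)  # share of incidental admissions filtered out
    return sensitivity, specificity


# Hypothetical labels for ten admissions (illustrative only).
actual = [True, True, True, True, False, False, False, False, False, False]
predicted = [True, True, True, False, False, False, False, False, True, False]

sens, spec = sensitivity_specificity(predicted, actual)
```

Here three of four true admissions are flagged and five of six incidental admissions are filtered out, giving a sensitivity of 0.75 and a specificity of about 0.83.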
8.
J Am Med Inform Assoc ; 29(8): 1334-1341, 2022 07 12.
Article in English | MEDLINE | ID: covidwho-1831208

ABSTRACT

OBJECTIVE: The increasing translation of artificial intelligence (AI)/machine learning (ML) models into clinical practice brings an increased risk of direct harm from modeling bias; however, bias remains incompletely measured in many medical AI applications. This article aims to provide a framework for objective evaluation of medical AI from multiple aspects, focusing on binary classification models. MATERIALS AND METHODS: Using data from over 56 000 Mass General Brigham (MGB) patients with confirmed severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), we evaluate unrecognized bias in 4 AI models developed during the early months of the pandemic in Boston, Massachusetts that predict risks of hospital admission, ICU admission, mechanical ventilation, and death after a SARS-CoV-2 infection purely based on their pre-infection longitudinal medical records. Models were evaluated both retrospectively and prospectively using model-level metrics of discrimination, accuracy, and reliability, and a novel individual-level metric for error. RESULTS: We found inconsistent instances of model-level bias in the prediction models. From an individual-level aspect, however, we found almost all models performing with slightly higher error rates for older patients. DISCUSSION: While a model can be biased against certain protected groups (ie, perform worse) in certain tasks, it can be at the same time biased towards another protected group (ie, perform better). As such, current bias evaluation studies may lack a full depiction of the variable effects of a model on its subpopulations. CONCLUSION: Only a holistic evaluation, a diligent search for unrecognized bias, can provide enough information for an unbiased judgment of AI bias that can invigorate follow-up investigations on identifying the underlying roots of bias and ultimately make a change.


Subject(s)
COVID-19 , Artificial Intelligence , Humans , Reproducibility of Results , Retrospective Studies , SARS-CoV-2
10.
Gen Hosp Psychiatry ; 74: 9-17, 2022.
Article in English | MEDLINE | ID: covidwho-1568701

ABSTRACT

OBJECTIVE: To validate a previously published machine learning model of delirium risk in hospitalized patients with coronavirus disease 2019 (COVID-19). METHOD: Using data from six hospitals across two academic medical networks covering care occurring after initial model development, we calculated the predicted risk of delirium using a previously developed risk model applied to diagnostic, medication, laboratory, and other clinical features available in the electronic health record (EHR) at time of hospital admission. We evaluated the accuracy of these predictions against subsequent delirium diagnoses during that admission. RESULTS: Of the 5102 patients in this cohort, 716 (14%) developed delirium. The model's risk predictions produced a c-index of 0.75 (95% CI, 0.73-0.77) with 27.7% of cases occurring in the top decile of predicted risk scores. Model calibration was diminished compared to the initial COVID-19 wave. CONCLUSION: This EHR delirium risk prediction model, developed during the initial surge of COVID-19 patients, produced consistent discrimination over subsequent larger waves; however, with changing cohort composition and delirium occurrence rates, model calibration decreased. These results underscore the importance of calibration, and the challenge of developing risk models for clinical contexts where standard of care and clinical populations may shift.


Subject(s)
COVID-19 , Delirium , Delirium/diagnosis , Delirium/epidemiology , Electronic Health Records , Hospitalization , Humans , Retrospective Studies , SARS-CoV-2
11.
BMC Med ; 19(1): 249, 2021 09 27.
Article in English | MEDLINE | ID: covidwho-1496168

ABSTRACT

BACKGROUND: For some SARS-CoV-2 survivors, recovery from the acute phase of the infection has been grueling with lingering effects. Many of the symptoms characterized as the post-acute sequelae of COVID-19 (PASC) could have multiple causes or are similarly seen in non-COVID patients. Accurate identification of PASC phenotypes will be important to guide future research and help the healthcare system focus its efforts and resources on adequately controlled age- and gender-specific sequelae of a COVID-19 infection. METHODS: In this retrospective electronic health record (EHR) cohort study, we applied a computational framework for knowledge discovery from clinical data, MLHO, to identify phenotypes that positively associate with a past positive reverse transcription-polymerase chain reaction (RT-PCR) test for COVID-19. We evaluated the post-test phenotypes in two temporal windows at 3-6 and 6-9 months after the test and by age and gender. Data from longitudinal diagnosis records stored in EHRs from Mass General Brigham in the Boston Metropolitan Area was used for the analyses. Statistical analyses were performed on data from March 2020 to June 2021. Study participants included over 96 thousand patients who had tested positive or negative for COVID-19 and were not hospitalized. RESULTS: We identified 33 phenotypes among different age/gender cohorts or time windows that were positively associated with past SARS-CoV-2 infection. All identified phenotypes were newly recorded in patients' medical records 2 months or longer after a COVID-19 RT-PCR test in non-hospitalized patients regardless of the test result. 
Among these phenotypes, a new diagnosis record for anosmia and dysgeusia (OR 2.60, 95% CI [1.94-3.46]), alopecia (OR 3.09, 95% CI [2.53-3.76]), chest pain (OR 1.27, 95% CI [1.09-1.48]), chronic fatigue syndrome (OR 2.60, 95% CI [1.22-2.10]), shortness of breath (OR 1.41, 95% CI [1.22-1.64]), pneumonia (OR 1.66, 95% CI [1.28-2.16]), and type 2 diabetes mellitus (OR 1.41, 95% CI [1.22-1.64]) is one of the most significant indicators of a past COVID-19 infection. Additionally, more new phenotypes were found with increased confidence among the cohorts who were younger than 65. CONCLUSIONS: The findings of this study confirm many of the post-COVID-19 symptoms and suggest that a variety of new diagnoses, including new diabetes mellitus and neurological disorder diagnoses, are more common among those with a history of COVID-19 than those without the infection. Additionally, more than 63% of PASC phenotypes were observed in patients under 65 years of age, pointing out the importance of vaccination to minimize the risk of debilitating post-acute sequelae of COVID-19 among younger adults.


Subject(s)
COVID-19 , COVID-19/complications , COVID-19/diagnosis , Humans , Phenotype , Retrospective Studies , Post-Acute COVID-19 Syndrome
12.
J Med Internet Res ; 23(10): e31400, 2021 10 11.
Article in English | MEDLINE | ID: covidwho-1463405

ABSTRACT

BACKGROUND: Many countries have experienced 2 predominant waves of COVID-19-related hospitalizations. Comparing the clinical trajectories of patients hospitalized in separate waves of the pandemic enables further understanding of the evolving epidemiology, pathophysiology, and health care dynamics of the COVID-19 pandemic. OBJECTIVE: In this retrospective cohort study, we analyzed electronic health record (EHR) data from patients with SARS-CoV-2 infections hospitalized in participating health care systems representing 315 hospitals across 6 countries. We compared hospitalization rates, severe COVID-19 risk, and mean laboratory values between patients hospitalized during the first and second waves of the pandemic. METHODS: Using a federated approach, each participating health care system extracted patient-level clinical data on their first and second wave cohorts and submitted aggregated data to the central site. Data quality control steps were adopted at the central site to correct for implausible values and harmonize units. Statistical analyses were performed by computing individual health care system effect sizes and synthesizing these using random effect meta-analyses to account for heterogeneity. We focused the laboratory analysis on C-reactive protein (CRP), ferritin, fibrinogen, procalcitonin, D-dimer, and creatinine based on their reported associations with severe COVID-19. RESULTS: Data were available for 79,613 patients, of which 32,467 were hospitalized in the first wave and 47,146 in the second wave. The prevalence of male patients and patients aged 50 to 69 years decreased significantly between the first and second waves. Patients hospitalized in the second wave had a 9.9% reduction in the risk of severe COVID-19 compared to patients hospitalized in the first wave (95% CI 8.5%-11.3%). 
Demographic subgroup analyses indicated that patients aged 26 to 49 years and 50 to 69 years; male and female patients; and black patients had significantly lower risk for severe disease in the second wave than in the first wave. At admission, the mean values of CRP were significantly lower in the second wave than in the first wave. On the seventh hospital day, the mean values of CRP, ferritin, fibrinogen, and procalcitonin were significantly lower in the second wave than in the first wave. In general, countries exhibited variable changes in laboratory testing rates from the first to the second wave. At admission, there was a significantly higher testing rate for D-dimer in France, Germany, and Spain. CONCLUSIONS: Patients hospitalized in the second wave were at significantly lower risk for severe COVID-19. This corresponded to mean laboratory values in the second wave that were more likely to be in typical physiological ranges on the seventh hospital day compared to the first wave. Our federated approach demonstrated the feasibility and power of harmonizing heterogeneous EHR data from multiple international health care systems to rapidly conduct large-scale studies to characterize how COVID-19 clinical trajectories evolve.


Subject(s)
COVID-19 , Pandemics , Adult , Aged , Female , Hospitalization , Hospitals , Humans , Male , Middle Aged , Retrospective Studies , SARS-CoV-2
13.
JAMIA Open ; 4(2): ooab036, 2021 Apr.
Article in English | MEDLINE | ID: covidwho-1266122

ABSTRACT

Clinical data networks that leverage large volumes of data in electronic health records (EHRs) are significant resources for research on coronavirus disease 2019 (COVID-19). Data harmonization is a key challenge in seamless use of multisite EHRs for COVID-19 research. We developed a COVID-19 application ontology in the national Accrual to Clinical Trials (ACT) network that enables harmonization of data elements that are critical to COVID-19 research. The ontology contains over 50 000 concepts in the domains of diagnosis, procedures, medications, and laboratory tests. In particular, it has computational phenotypes to characterize the course of illness and outcomes, derived terms, and harmonized value sets for severe acute respiratory syndrome coronavirus 2 laboratory tests. The ontology was deployed and validated on the ACT COVID-19 network that consists of 9 academic health centers with data on 14.5M patients. This ontology, which is freely available to the entire research community on GitHub at https://github.com/shyamvis/ACT-COVID-Ontology, will be useful for harmonizing EHRs for COVID-19 research beyond the ACT network.

14.
JAMA Netw Open ; 4(6): e2112596, 2021 06 01.
Article in English | MEDLINE | ID: covidwho-1265355

ABSTRACT

Importance: Additional sources of pediatric epidemiological and clinical data are needed to efficiently study COVID-19 in children and youth and inform infection prevention and clinical treatment of pediatric patients. Objective: To describe international hospitalization trends and key epidemiological and clinical features of children and youth with COVID-19. Design, Setting, and Participants: This retrospective cohort study included pediatric patients hospitalized between February 2 and October 10, 2020. Patient-level electronic health record (EHR) data were collected across 27 hospitals in France, Germany, Spain, Singapore, the UK, and the US. Patients younger than 21 years who tested positive for COVID-19 and were hospitalized at an institution participating in the Consortium for Clinical Characterization of COVID-19 by EHR were included in the study. Main Outcomes and Measures: Patient characteristics, clinical features, and medication use. Results: There were 347 males (52%; 95% CI, 48.5-55.3) and 324 females (48%; 95% CI, 44.4-51.3) in this study's cohort. There was a bimodal age distribution, with the greatest proportion of patients in the 0- to 2-year (199 patients [30%]) and 12- to 17-year (170 patients [25%]) age range. Trends in hospitalizations for 671 children and youth found discrete surges with variable timing across 6 countries. Data from this cohort mirrored national-level pediatric hospitalization trends for most countries with available data, with peaks in hospitalizations during the initial spring surge occurring within 23 days in the national-level and 4CE data. A total of 27 364 laboratory values for 16 laboratory tests were analyzed, with mean values indicating elevations in markers of inflammation (C-reactive protein, 83 mg/L; 95% CI, 53-112 mg/L; ferritin, 417 ng/mL; 95% CI, 228-607 ng/mL; and procalcitonin, 1.45 ng/mL; 95% CI, 0.13-2.77 ng/mL). 
Abnormalities in coagulation were also evident (D-dimer, 0.78 ug/mL; 95% CI, 0.35-1.21 ug/mL; and fibrinogen, 477 mg/dL; 95% CI, 385-569 mg/dL). Cardiac troponin, when checked (n = 59), was elevated (0.032 ng/mL; 95% CI, 0.000-0.080 ng/mL). Common complications included cardiac arrhythmias (15.0%; 95% CI, 8.1%-21.7%), viral pneumonia (13.3%; 95% CI, 6.5%-20.1%), and respiratory failure (10.5%; 95% CI, 5.8%-15.3%). Few children were treated with COVID-19-directed medications. Conclusions and Relevance: This study of EHRs of children and youth hospitalized for COVID-19 in 6 countries demonstrated variability in hospitalization trends across countries and identified common complications and laboratory abnormalities in children and youth with COVID-19 infection. Large-scale informatics-based approaches to integrate and analyze data across health care systems complement methods of disease surveillance and advance understanding of epidemiological and clinical features associated with COVID-19 in children and youth.


Subject(s)
COVID-19/epidemiology , Electronic Health Records/statistics & numerical data , Hospitalization/statistics & numerical data , Pandemics , SARS-CoV-2 , Adolescent , Child , Child, Preschool , Female , Global Health , Humans , Infant , Infant, Newborn , Male , Retrospective Studies
15.
Sci Rep ; 11(1): 5322, 2021 03 05.
Article in English | MEDLINE | ID: covidwho-1118817

ABSTRACT

The COVID-19 pandemic has devastated the world with health and economic wreckage. Precise estimates of adverse outcomes from COVID-19 could have led to better allocation of healthcare resources and more efficient targeted preventive measures, including insight into prioritizing how to best distribute a vaccination. We developed MLHO (pronounced as melo), an end-to-end Machine Learning framework that leverages iterative feature and algorithm selection to predict Health Outcomes. MLHO implements iterative sequential representation mining, and feature and model selection, for predicting patient-level risk of hospitalization, ICU admission, need for mechanical ventilation, and death. It bases this prediction on data from patients' past medical records (before their COVID-19 infection). MLHO's architecture enables a parallel and outcome-oriented model calibration, in which different statistical learning algorithms and vectors of features are simultaneously tested to improve prediction of health outcomes. Using clinical and demographic data from a large cohort of over 13,000 COVID-19-positive patients, we modeled the four adverse outcomes utilizing about 600 features representing patients' pre-COVID health records and demographics. The mean AUC ROC for mortality prediction was 0.91, while the prediction performance ranged between 0.80 and 0.81 for the ICU, hospitalization, and ventilation. We broadly describe the clusters of features that were utilized in modeling and their relative influence for predicting each outcome. Our results demonstrated that while demographic variables (namely age) are important predictors of adverse outcomes after a COVID-19 infection, the incorporation of the past clinical records are vital for a reliable prediction model. 
As the COVID-19 pandemic unfolds around the world, adaptable and interpretable machine learning frameworks (like MLHO) are crucial to improve our readiness for confronting the potential future waves of COVID-19, as well as other novel infectious diseases that may emerge.


Subject(s)
COVID-19/mortality , Data Mining/methods , Machine Learning , Models, Statistical , Adult , Age Factors , Aged , Aged, 80 and over , COVID-19/diagnosis , COVID-19/therapy , COVID-19/virology , Electronic Health Records/statistics & numerical data , Female , Hospitalization/statistics & numerical data , Humans , Intensive Care Units/statistics & numerical data , Male , Middle Aged , Pandemics/statistics & numerical data , Prognosis , ROC Curve , Reproducibility of Results , Respiration, Artificial/statistics & numerical data , Retrospective Studies , Risk Assessment/methods , Risk Factors , SARS-CoV-2/isolation & purification , SARS-CoV-2/pathogenicity
16.
J Med Internet Res ; 23(3): e22219, 2021 03 02.
Article in English | MEDLINE | ID: covidwho-1088863

ABSTRACT

Coincident with the tsunami of COVID-19-related publications, there has been a surge of studies using real-world data, including those obtained from the electronic health record (EHR). Unfortunately, several of these high-profile publications were retracted because of concerns regarding the soundness and quality of the studies and the EHR data they purported to analyze. These retractions highlight that although a small community of EHR informatics experts can readily identify strengths and flaws in EHR-derived studies, many medical editorial teams and otherwise sophisticated medical readers lack the framework to fully critically appraise these studies. In addition, conventional statistical analyses cannot overcome the need for an understanding of the opportunities and limitations of EHR-derived studies. We distill here from the broader informatics literature six key considerations that are crucial for appraising studies utilizing EHR data: data completeness, data collection and handling (eg, transformation), data type (ie, codified, textual), robustness of methods against EHR variability (within and across institutions, countries, and time), transparency of data and analytic code, and the multidisciplinary approach. These considerations will inform researchers, clinicians, and other stakeholders as to the recommended best practices in reviewing manuscripts, grants, and other outputs from EHR-data derived studies, and thereby promote and foster rigor, quality, and reliability of this rapidly growing field.


Subject(s)
COVID-19/epidemiology , Data Collection/methods , Electronic Health Records , Data Collection/standards , Humans , Peer Review, Research/standards , Publishing/standards , Reproducibility of Results , SARS-CoV-2/isolation & purification
17.
J Am Med Inform Assoc ; 28(7): 1411-1420, 2021 07 14.
Article in English | MEDLINE | ID: covidwho-1075534

ABSTRACT

OBJECTIVE: The Consortium for Clinical Characterization of COVID-19 by EHR (4CE) is an international collaboration addressing coronavirus disease 2019 (COVID-19) with federated analyses of electronic health record (EHR) data. We sought to develop and validate a computable phenotype for COVID-19 severity. MATERIALS AND METHODS: Twelve 4CE sites participated. First, we developed an EHR-based severity phenotype consisting of 6 code classes, and we validated it on patient hospitalization data from the 12 4CE clinical sites against the outcomes of intensive care unit (ICU) admission and/or death. We also piloted an alternative machine learning approach and compared selected predictors of severity with the 4CE phenotype at 1 site. RESULTS: The full 4CE severity phenotype had pooled sensitivity of 0.73 and specificity of 0.83 for the combined outcome of ICU admission and/or death. The sensitivity of individual code categories for acuity had high variability, up to 0.65 across sites. At one pilot site, the expert-derived phenotype had mean area under the curve of 0.903 (95% confidence interval, 0.886-0.921), compared with an area under the curve of 0.956 (95% confidence interval, 0.952-0.959) for the machine learning approach. Billing codes were poor proxies of ICU admission, with as low as 49% precision and recall compared with chart review. DISCUSSION: We developed a severity phenotype using 6 code classes that proved resilient to coding variability across international institutions. In contrast, machine learning approaches may overfit hospital-specific orders. Manual chart review revealed discrepancies even in the gold-standard outcomes, possibly owing to heterogeneous pandemic conditions. CONCLUSIONS: We developed an EHR-based severity phenotype for COVID-19 in hospitalized patients and validated it at 12 international sites.
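As a reading aid for the sensitivity and specificity figures reported above, the following is a minimal sketch of how those two measures are computed when a binary phenotype flag is compared with a binary outcome such as ICU admission and/or death. The function name and the toy data are hypothetical illustrations and are not taken from the 4CE study or its code.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    flagged: Sequence[bool], outcome: Sequence[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of a binary flag against a binary outcome.

    Hypothetical helper for illustration only; not the 4CE implementation.
    """
    if len(flagged) != len(outcome):
        raise ValueError("flagged and outcome must have the same length")

    # Count the four cells of the 2x2 confusion matrix.
    tp = sum(1 for f, o in zip(flagged, outcome) if f and o)
    fn = sum(1 for f, o in zip(flagged, outcome) if not f and o)
    tn = sum(1 for f, o in zip(flagged, outcome) if not f and not o)
    fp = sum(1 for f, o in zip(flagged, outcome) if f and not o)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative outcome")

    sensitivity = tp / (tp + fn)  # share of true outcomes that were flagged
    specificity = tn / (tn + fp)  # share of non-outcomes that were not flagged
    return sensitivity, specificity


# Toy data (invented): five patients, phenotype flag vs. observed outcome.
flag = [True, True, False, False, True]
severe = [True, False, False, True, True]
sens, spec = sensitivity_specificity(flag, severe)
```

In this toy example two of the three patients with the outcome are flagged and one of the two patients without it is not flagged. Pooling such values across sites, as the abstract reports, would require an additional aggregation step that is not shown here.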


Subject(s)
COVID-19 , Electronic Health Records , Severity of Illness Index , COVID-19/classification , Hospitalization , Humans , Machine Learning , Prognosis , ROC Curve , Sensitivity and Specificity
18.
NPJ Digit Med ; 4(1): 15, 2021 Feb 04.
Article in English | MEDLINE | ID: covidwho-1065966

ABSTRACT

This study aims to predict death after COVID-19 using only the past medical information routinely collected in electronic health records (EHRs) and to understand the differences in risk factors across age groups. Combining computational methods and clinical expertise, we curated clusters that represent 46 clinical conditions as potential risk factors for death after a COVID-19 infection. We trained age-stratified generalized linear models (GLMs) with component-wise gradient boosting to predict the probability of death based on what we know from the patients before they contracted the virus. Despite only relying on previously documented demographics and comorbidities, our models demonstrated similar performance to other prognostic models that require an assortment of symptoms, laboratory values, and images at the time of diagnosis or during the course of the illness. In general, we found age as the most important predictor of mortality in COVID-19 patients. A history of pneumonia, which is rarely asked in typical epidemiology studies, was one of the most important risk factors for predicting COVID-19 mortality. A history of diabetes with complications and cancer (breast and prostate) were notable risk factors for patients between the ages of 45 and 65 years. In patients aged 65-85 years, diseases that affect the pulmonary system, including interstitial lung disease, chronic obstructive pulmonary disease, lung cancer, and a smoking history, were important for predicting mortality. The ability to compute precise individual-level risk scores exclusively based on the EHR is crucial for effectively allocating and distributing resources, such as prioritizing vaccination among the general population.

19.
Drugs ; 80(18): 1961-1972, 2020 Dec.
Article in English | MEDLINE | ID: covidwho-910395

ABSTRACT

BACKGROUND: Treatment decisions for Coronavirus Disease 2019 (COVID-19) depend on disease severity, but the prescribing pattern by severity and drivers of therapeutic choices remain unclear. OBJECTIVES: The objectives of the study were to evaluate pharmacological treatment patterns by COVID-19 severity and identify the determinants of prescribing for COVID-19. METHODS: Using electronic health record data from a large Massachusetts-based healthcare system, we identified all patients aged ≥ 18 years hospitalized with laboratory-confirmed COVID-19 from 1 March to 24 May, 2020. We defined five levels of COVID-19 severity at hospital admission: (1) hospitalized but not requiring supplemental oxygen; (2-4) hospitalized and requiring oxygen ≤ 2, 3-4, and ≥ 5 L per minute, respectively; and (5) intubated or admitted to an intensive care unit. We assessed the medications used to treat COVID-19 or as supportive care during hospitalization. RESULTS: Among 2821 patients hospitalized for COVID-19, we found inpatient mortality increased by severity from 5% for level 1 to 23% for level 5. As compared to patients with severity level 1, those with severity level 5 were 3.53 times (95% confidence interval 2.73-4.57) more likely to receive a medication used to treat COVID-19. Other predictors of treatment were fever, low oxygen saturation, presence of co-morbidities, and elevated inflammatory biomarkers. The use of most COVID-19 relevant medications has dropped substantially while the use of remdesivir and therapeutic anticoagulants has increased over the study period. CONCLUSIONS: Careful consideration of disease severity and other determinants of COVID-19 drug use is necessary for appropriate conduct and interpretation of non-randomized studies evaluating outcomes of COVID-19 treatments.
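The five admission severity levels described in the METHODS above can be sketched as a simple classification rule. This is an illustrative sketch only: the function and parameter names are hypothetical, and the handling of flow rates that fall between the stated bands (for example, 2.5 L per minute) is an assumption, since the abstract does not specify it.

```python
from typing import Optional


def severity_level(oxygen_l_per_min: Optional[float], intubated_or_icu: bool) -> int:
    """Map admission status to severity levels 1-5 as listed in the abstract.

    Hypothetical helper; not the study's actual code.
    """
    if intubated_or_icu:
        return 5  # intubated or admitted to an intensive care unit
    if oxygen_l_per_min is None or oxygen_l_per_min <= 0:
        return 1  # hospitalized, no supplemental oxygen
    if oxygen_l_per_min <= 2:
        return 2  # oxygen <= 2 L per minute
    if oxygen_l_per_min < 5:
        # Abstract states 3-4 L per minute; values between 2 and 3
        # are placed here by assumption.
        return 3
    return 4  # oxygen >= 5 L per minute


# Example: a patient on 4 L per minute, not intubated and not in the ICU.
level = severity_level(4.0, intubated_or_icu=False)
```

Checking the intubation/ICU condition first reflects the assumption that level 5 takes precedence over any recorded oxygen flow rate.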


Subject(s)
COVID-19 Drug Treatment , COVID-19/mortality , Hospitalization , Adolescent , Adrenal Cortex Hormones/therapeutic use , Adult , Age Factors , Aged , Aged, 80 and over , Anticoagulants/therapeutic use , Antiviral Agents/therapeutic use , Biological Products/therapeutic use , Body Mass Index , COVID-19/epidemiology , Comorbidity , Comoros , Drug Therapy, Combination , Drug Utilization , Extracorporeal Membrane Oxygenation/statistics & numerical data , Female , Humans , Male , Middle Aged , Oxygen Inhalation Therapy/methods , Pandemics , Racial Groups , Respiration, Artificial/statistics & numerical data , Retrospective Studies , SARS-CoV-2 , Severity of Illness Index , Sex Factors , Smoking/epidemiology , Young Adult
20.
NPJ Digit Med ; 3: 109, 2020.
Article in English | MEDLINE | ID: covidwho-728999

ABSTRACT

We leveraged the largely untapped resource of electronic health record data to address critical clinical and epidemiological questions about Coronavirus Disease 2019 (COVID-19). To do this, we formed an international consortium (4CE) of 96 hospitals across five countries (www.covidclinical.net). Contributors utilized the Informatics for Integrating Biology and the Bedside (i2b2) or Observational Medical Outcomes Partnership (OMOP) platforms to map to a common data model. The group focused on temporal changes in key laboratory test values. Harmonized data were analyzed locally and converted to a shared aggregate form for rapid analysis and visualization of regional differences and global commonalities. Data covered 27,584 COVID-19 cases with 187,802 laboratory tests. Case counts and laboratory trajectories were concordant with existing literature. Laboratory tests at the time of diagnosis showed hospital-level differences equivalent to country-level variation across the consortium partners. Despite the limitations of decentralized data generation, we established a framework to capture the trajectory of COVID-19 disease in patients and their response to interventions.
